Abstract



We study the capacity of discrete memoryless many-to-one interference channels, i.e., K user interference channels 
where only one receiver faces interference. For a class of many-to-one interference channels, we identify a noisy 
interference regime, i.e., a regime where random coding and treating interference as noise achieves sum-capacity. 
Specializing our results to the Gaussian MIMO many-to-one interference channel, which is a special case of the 
class of channels considered, we obtain new capacity results. Firstly, while previous results characterized noisy 
interference regimes for many-to-one interference channels with inputs having average power constraints, we show 
that this remains valid for a more general class of inputs. This more general class of inputs includes the practical 
scenario of the inputs being restricted to fixed finite-size constellations such as PSK or QAM. Secondly, we extend 
noisy interference results previously studied in interference channels with single antenna nodes at all transmitters, 
to MIMO and parallel many-to-one interference channels. Finally, while previous results considered the Gaussian 
interference channel with full channel state information (CSI) at all nodes, we provide a noisy interference regime 
for fading Gaussian many-to-one interference channels without CSI at the transmitters. 

While the many-to-one interference channel requires interference alignment, which in turn requires structured 
codes in general, we argue that in the noisy interference regime, interference is implicitly aligned by random coding 
irrespective of the input distribution. As a byproduct of our study, we identify a second class of many-to-one 
interference channels (albeit deterministic) where random coding is optimal (though interference is not treated as 
noise). We attribute the optimality of random coding in this second class of channels to the resolvability of the 
multiple interferers at the receiver which precludes the possibility of interference alignment and hence obviates the 
need of structured codes. 



I. Introduction

The idea of interference alignment has recently been found to play a significant role in characterizing the
capacity of wireless interference networks [1]-[4]. Interference alignment is the idea that signals are designed so
that they overlap at receivers where they cause interference while remaining distinguishable at receivers where they
are desired. In network communication scenarios where receivers face interference from multiple sources, alignment
compacts the space occupied by the multiple interfering signals and results in increased rates for the messages desired
at the receiver. Therefore, optimal code design for interference channels typically involves a conflict between the
need for interference management via alignment at undesired receivers, and the need to maximize rates at the
desired receiver. This conflict is clearly reflected in the contrast between interference channels, and channels that
do not require alignment, viz. point-to-point, multiple access (MAC) and broadcast (BC) channels. For instance, it
is well known that independent and identically distributed (i.i.d.) circularly symmetric Gaussian distributions
on the inputs, which maximize the differential entropy of the received signals, achieve capacity in the
point-to-point, MAC and BC channels. In contrast, in interference channels, asymmetric complex signaling [5]
and structured (lattice) codes [4], [6]-[9] have been shown to be useful, especially because they align interference.
In fact, the lack of a complete understanding of the limits of interference alignment is among the primary hurdles in
capacity characterizations of wireless networks. In this paper, we provide a finer understanding of interference
alignment, and characterize the sum-capacity of a class of discrete memoryless (many-to-one) interference channels.

Fig. 1. The 3-user Gaussian many-to-one interference channel and the noisy interference condition: $H_{12}X_2 + H_{13}X_3 + Z_1$ is a degraded version of $(Y_2, Y_3)$.

The results demonstrating the need for explicit interference alignment via lattice coding or asymmetric complex
signaling contrast with the "noisy interference" results for the Gaussian K user interference channel found recently
in references [10]-[12]. These references showed that for the K user interference channel, if the channel
gains satisfy certain conditions, then, using random codebooks with circularly symmetric Gaussian distributions 
for all messages and treating interference as noise at all receivers is sum-capacity optimal. Put differently, these 
results indicate that in certain scenarios, explicit alignment in the form of structured (lattice) coding or asymmetric 
signaling is not necessary, and random coding is optimal, even though there is potential for alignment with receivers 
facing multiple interferers. One of the main goals of this work is a better understanding of why random (Gaussian) 
coding is optimal in the noisy interference regime, in spite of the opportunity for alignment. We study this question 
in the setting of the discrete memoryless many-to-one interference channel - the interference channel where only 
one receiver faces interference - which is the simplest setting where a receiver faces multiple interferers. The main 
result of this work is the characterization of a noisy interference regime for a class of discrete memoryless many- 
to-one interference channels. The noisy interference condition obtained here can be loosely described as follows: 
In the many-to-one interference channel, if the effective interference (with noise) seen by the only receiver facing 
interference is a stochastically degraded version of the set of received signals at all other receivers, then, random 
coding at all the transmitters and interference being treated as noise at the receiver facing interference achieves 
sum-capacity. 

The above result, which will be expressed rigorously later (Section III), holds for a broad class of discrete memoryless
many-to-one interference channels including the Gaussian many-to-one interference channel. From the perspective
of Gaussian interference channels, we make two observations. Firstly, our main result captures the noisy interference
regime for the single-antenna Gaussian many-to-one interference channel found previously in references [10]-[12].
Secondly, while the results of references [10]-[12] are mainly restricted to single-antenna Gaussian interference
channels with classical assumptions on the model, such as full channel state information (CSI) at all nodes and
average power constraints on the inputs, our result described above holds for a broader class of discrete memoryless
many-to-one interference channels, and is therefore more robust to the system model. Specializing our main result
to the Gaussian setting enables us to extend the noisy interference results to scenarios of practical importance not
captured by such classical assumptions and thus not previously considered. Before summarizing such extensions,
we first describe how our main result, summarized above, captures the noisy interference regime for
many-to-one interference channels previously discovered in [10]-[12].

Consider a K user Gaussian many-to-one interference channel (Fig. 1), whose inputs and outputs can be expressed
as

$$Y_i(\tau) = H_{ii} X_i(\tau) + Z_i(\tau), \quad i = 2, 3, \ldots, K$$

$$Y_1(\tau) = \sum_{i=1}^{K} H_{1i} X_i(\tau) + Z_1(\tau)$$

where, corresponding to the $\tau$th symbol, $X_i(\tau)$ is the complex scalar input at Transmitter $i$, $Y_j(\tau)$ is the complex
scalar output at Receiver $j$ and $Z_j(\tau)$ represents the zero-mean unit-variance circularly symmetric additive white
Gaussian noise (AWGN) variable at Receiver $j$. $H_{ji}$ is a complex scalar representing the channel gain between
Transmitter $i$ and Receiver $j$. As is standard in the interference channel, Transmitter $i$ has a message for Receiver
$i$, which is independent of the messages at, and unknown to, the other transmitters (and unknown to all receivers prior
to communication). Then, in this channel, with an average power constraint on the inputs, it is shown in [10]-[12]
that if

$$\sum_{j=2}^{K} \frac{|H_{1j}|^2}{|H_{jj}|^2} \leq 1, \tag{1}$$

then circularly symmetric Gaussian inputs and treating interference as noise at all receivers is sum-capacity optimal.
We note that the condition in (1) is a special case of the condition stated in our main result above,
i.e., when (1) holds, the effective interference at Receiver 1, $V = \sum_{j=2}^{K} H_{1j}X_j + Z_1$, is a stochastically degraded
version of the set of signals received at all other receivers, i.e., $(Y_2, Y_3, \ldots, Y_K)$. This is because $V$ is a degraded
version of $\sum_{i=2}^{K} \frac{H_{1i}}{H_{ii}} Y_i$, which is, in turn, a degraded version of $(Y_2, Y_3, \ldots, Y_K)$. Thus, we have shown that
the noisy interference regime of [10]-[12] is included in the noisy interference regime found in our main result.
In Appendix I we show that the two regimes, i.e., the regime described in (1) and the regime described by our main
result, are in fact equivalent. While our results are equivalent to the results of previous works in the context
of the classical single-antenna Gaussian many-to-one interference channels, as mentioned earlier, our result holds
for a more general class of Gaussian many-to-one interference channels. Specifically, our results extend the noisy
interference regime to many-to-one interference channels beyond the classical assumptions. We summarize such
extensions below.

• Previous works [10]-[12] consider Gaussian interference channels where the input alphabet is continuous and
there is an average power constraint on the input codewords. For these channels, the references show that
for certain values of channel gains, random Gaussian codebooks and treating interference as noise is optimal.
In practice, however, input signals are
typically restricted to fixed finite-size constellations such as PSK or QAM. It is not clear whether the noisy
interference results carry over to the more practical setting of inputs constrained to fixed
constellations. In fact, it remained open whether there even exists a non-trivial set of channel
gains where random coding and treating interference as noise achieves sum-capacity in this setting. In this
work, we settle this open question by showing that the noisy interference regime remains valid in the Gaussian
many-to-one interference channel even if the inputs are restricted to fixed constellations. In other words, if
the channel gains satisfy the condition in (1), then random codebooks generated i.i.d. with the appropriate
distribution at the inputs, and treating interference as noise at the receiver facing interference, achieve
sum-capacity even if the inputs are restricted to fixed constellations. Therefore, in the noisy interference regime, the
capacity characterization problem is essentially reduced to the problem of determining the optimal single-letter
distribution on the inputs.

• The results of [10]-[12] are for interference channels with a single antenna at each node; the question of the
existence and characterization of noisy interference regimes for MIMO interference channels remains open.
In this work, we (partially) address this open question by characterizing a noisy interference regime for the
MIMO Gaussian many-to-one interference channel. Note that extensions of [10]-[12] have been proposed for
two user MIMO interference channels [13], [14]. Our result differs from the results of [13], [14] in that we
present a noisy interference regime for the K-user interference channel, albeit not fully connected (since we
only consider the many-to-one interference channel). It must be noted that the noisy interference regime for
the MIMO setting also remains valid for input signals restricted to finite constellations.

• Previous noisy interference results are presented for the case where the channel is constant (i.e., not fading),
and when transmitters and receivers have channel state information (CSI). In this paper, we obtain a noisy
interference regime for the fading Gaussian many-to-one interference channel where transmitters do not have
CSI, and only the receivers have CSI.



¹ We have so far only discussed the optimality of random coding and treating interference as noise in our main result. References [10]-[12]
also show the optimality of the circularly symmetric Gaussian distribution in the noisy interference regime; this optimality will also be shown
for Gaussian channels in the formal description of our result in Section III.



• Previous results [15], [16] have shown that parallel (i.e., multi-carrier) Gaussian interference channels (including
many-to-one interference channels), unlike point-to-point, multiple access and broadcast channels, are in
general inseparable, i.e., joint coding over the multiple carriers is required to achieve sum-capacity in parallel
interference channels. Under certain special
conditions, however, they are separable, i.e., separate (independent) coding over the various carriers (and in fact, treating
interference as noise) achieves sum-capacity. Such conditions have been identified for parallel single-antenna
Z interference channels in [17] and for the (fully-connected) 2-user Gaussian interference channels in [18]. In
this paper, we extend the results of [17] to Gaussian MIMO many-to-one interference channels. In particular,
we show that if the many-to-one interference channel formed over each
carrier of the parallel channel satisfies our noisy interference conditions, then the channel is separable from
a sum-capacity perspective. For example, in the single-antenna Gaussian many-to-one interference channel,
if the channel gains on each of the carriers satisfy the condition of (1), then separate random coding and
treating interference as noise achieves sum-capacity. Therefore, in this case, with an average power constraint
on the input, the sum-capacity of the parallel many-to-one interference channel is the sum of the capacities
of the individual carriers under an optimal power allocation, much like the point-to-point, MAC and
BC channels. Further, this separability result is not limited to the average power constraint on the inputs, and
holds even for inputs from fixed finite constellations. It must be noted that our main result automatically implies
that random coding and treating interference as noise over such a channel (where each carrier satisfies our
noisy interference criterion) achieves sum-capacity, because the required degradedness condition holds for the
parallel channel. Our main result does not, however, imply separability, i.e., the property
that the optimal distribution used in random coding has the input over each carrier independent of the inputs
of the other carriers. Separability is an additional result shown in Section III-B.

Why is explicit interference alignment not required in noisy interference regimes? 

An important insight to emerge from this work is that in the noisy interference regime, interference is aligned 
implicitly via random codes. The idea of interference alignment with random codes can be understood in the 
following setting. Consider a receiver receiving multiple signals coded from a codebook generated in the classical 
random coding fashion. If the cardinalities of the codebooks corresponding to these signals lie in the achievable 
random coding rate region (with the corresponding input distributions) of the multiple access channel formed at 
the receiver, then the receiver can resolve these multiple signals with high probability. In other words, the signals 
are not aligned. On the other hand, if the cardinalities of the codebooks lie outside this achievable rate region of the
multiple access channel formed at the receiver, then the signals align, and the receiver cannot
resolve them uniquely. While alignment is not a desirable phenomenon
if the receiver intends to resolve the signals as is the case in the multiple access channel, it is beneficial if the 
signals are interfering at the receiver as is the case in interference channels. In the noisy interference regime for 
the many-to-one interference channel, we show that because of a degraded nature of the channel, interference can 
be aligned with random codes for any distribution on the inputs. In particular, in the Gaussian channel, interference 
is aligned, even with random coding and with the circularly symmetric Gaussian distribution; thus, alignment is 
implicit in this case. 
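
The resolvable-versus-aligned distinction above can be illustrated with a small sketch. It assumes a Gaussian multiple access channel with unit-variance noise at the receiver and Gaussian random codebooks; the function name `is_resolvable` and the SNR values are hypothetical choices made only for this illustration. A rate tuple inside the region means the receiver can resolve the signals, and a rate tuple outside it corresponds to the aligned case described in the text.

```python
import math
from itertools import combinations

def is_resolvable(rates, snrs):
    """Return True if the rate tuple (bits per symbol) lies inside the
    Gaussian MAC region seen by the receiver, assuming unit-variance
    noise and Gaussian random codebooks with received SNRs `snrs`."""
    users = range(len(rates))
    for size in range(1, len(rates) + 1):
        for subset in combinations(users, size):
            total_rate = sum(rates[i] for i in subset)
            capacity = math.log2(1 + sum(snrs[i] for i in subset))
            if total_rate > capacity:
                return False  # this subset cannot be decoded jointly
    return True

# Two interferers, each received at SNR 1 (hypothetical numbers).
print(is_resolvable([0.5, 0.5], [1.0, 1.0]))  # low rates: resolvable
print(is_resolvable([1.0, 1.0], [1.0, 1.0]))  # sum rate too high: not resolvable
```

In the second call each codebook is individually decodable, but the pair of rates exceeds the sum-rate constraint, which is the situation in which random codes overlap at the receiver.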

It must be noted that the optimality of random coding in the noisy interference regime is desirable from two 
perspectives. First, the generality of the random coding argument enables us to present results for a fairly broad class
of channels which may or may not be linear (though we later specialize our results to the linear Gaussian setting). 
Secondly, the optimality of random codes enables a single-letter characterization for the capacity, unlike in channels 
which need structured codes, where single-letter characterizations may not even exist [19]. 

The idea of implicit interference alignment in the noisy interference regime is examined more closely in Section
IV by specializing the noisy interference regime to a deterministic many-to-one interference channel. In this setting,
we observe that if a random code transmits at sufficiently high rates, then the interference becomes noisy and 
the extent of alignment via random codes is optimal. The simpler setting of the deterministic channel, apart from 
enabling a better understanding of the idea of alignment via random coding, allows two other interesting insights 
into interference alignment. Firstly, we find that random coding achieves capacity in a scenario where the multiple 
interferers are resolvable at the receiver facing interference. (The idea of resolvable interference was used earlier
to determine the capacity of a class of symmetric deterministic interference channels in [20].) The resolvability
of interference precludes the possibility of interference alignment, which enables a characterization of the capacity
region. Secondly, a combination of insights from the noisy interference regime and the resolvable interference
regime enables us to provide, in Section IV-C, a (partial) answer to the question: How many bits of additional rate
can interference alignment provide on the many-to-one interference channel?

Fig. 2. The 3-user many-to-one interference channel.

We now proceed to the next section where we formally define the discrete memoryless many-to-one interference 
channel - the basic setting of all the results of this paper. 

II. System Model : A Class of Discrete Memoryless Many-to-One Interference Channels 

The K user discrete memoryless many-to-one interference channel (Figure 2) is defined by a set of K inputs
$X_i \in \mathcal{X}_i$, and a set of K outputs $Y_i \in \mathcal{Y}_i$ for $i = 1, 2, \ldots, K$. In the class of many-to-one interference channels
considered, the outputs $Y_i$, $i = 2, 3, \ldots, K$, are generated using the distributions $p_{Y_i|X_i}$. The output $Y_1$ is generated
as

$$Y_1 = f_1(X_1, V),$$

where $V \in \mathcal{V}$ is generated using $p_{V|X_2, X_3, \ldots, X_K}$. We assume that $V$ is invertible from $(Y_1, X_1)$, i.e., there exists a
function $f_1^{-1}$ such that

$$V = f_1^{-1}(Y_1, X_1). \tag{2}$$

There are K independent messages, with message $W_i \in \mathcal{W}_i$ generated at source $i \in \{1, 2, \ldots, K\}$, with each
message being uniformly distributed over the corresponding message set. A code of length $T$ symbols consists of
encoding functions (or equivalently, codebooks) $\phi_i : \mathcal{W}_i \to \mathcal{X}_i^T$ and decoding functions $\psi_i : \mathcal{Y}_i^T \to \mathcal{W}_i$ for all
$i \in \{1, 2, \ldots, K\}$. It is assumed that all the codebooks, i.e., all the mappings $\phi_i$, $i = 1, 2, \ldots, K$, are known to all
the decoders. We restrict our study to channels and constraints on codewords which ensure that one of the following
two sets of quantities exists:

• $H(Y_i^T)$ and $H(Y_i^T | X_i^T)$ exist for $i = 1, 2, \ldots, K$. Note that using $i = 1$ and (2), this automatically implies
that $H(V^T)$ exists. Also note that this condition captures all channels where the alphabets $\mathcal{X}_i, \mathcal{Y}_i$ are finite.

• $h(Y_i^T)$ and $h(Y_i^T | X_i^T)$ exist for $i = 1, 2, \ldots, K$. Note that using $i = 1$ and (2), this automatically implies
that $h(V^T)$ exists.

The average probability of error of the code, $P_e$, is defined to be the probability that the set of decoded messages
is not identical to the set of encoded messages, i.e.,

$$P_e \triangleq \Pr\left( \exists\, i \in \{1, 2, \ldots, K\} : \psi_i(Y_i^T) \neq W_i \right).$$

The rate of the code is the tuple $R = (R_1, R_2, \ldots, R_K)$, where $R_i = \frac{\log |\mathcal{W}_i|}{T}$, with $|\mathcal{W}_i|$ denoting the cardinality
of the message set $\mathcal{W}_i$. A rate-tuple $R$ is said to be achievable if there exists a sequence of codes, all of rate $R$,
such that the average probability of error vanishes asymptotically, as the sequence index increases. Let $\mathcal{C}$ be the closure
of the set of all achievable rate tuples. The sum-capacity $C_s$ of the interference channel is defined as

$$C_s = \max_{(R_1, R_2, \ldots, R_K) \in \mathcal{C}} \sum_{i=1}^{K} R_i.$$

Notation: We use the notation $A^T$ to denote $(A(1), A(2), \ldots, A(T)) \in \mathcal{A}^T$ for any random variable $A$. The
calligraphic notation is used to indicate sets. The notation $\mathcal{N}(\mu, \Lambda)$ is used to indicate a circularly symmetric
complex Gaussian random vector with mean $\mu$ and covariance matrix $\Lambda$. $I_N$ is used to denote the $N \times N$ identity
matrix. The following quantities are also used in the paper.

$$\mathcal{K}_1 = \{2, 3, \ldots, K\}$$
$$\mathcal{K} = \mathcal{K}_1 \cup \{1\}$$
$$Y_{\mathcal{A}} = \{Y_i : i \in \mathcal{A}\}, \quad \mathcal{A} \subseteq \mathcal{K}.$$

Before we proceed, the following points must be noted. 

• The channel studied here is a natural adaptation, to the many-to-one interference channel setting, of the 2 user
interference channel studied in [21].

• In the special case where all the alphabets $\mathcal{X}_i, \mathcal{Y}_i, \mathcal{V}$ are finite for all $i$, and all the distribution functions are
deterministic (i.e., $H(Y_i|X_i) = H(V|X_2, X_3, \ldots, X_K) = 0$ for all $i = 2, \ldots, K$ irrespective of the input
distribution), the channel is an adaptation, to the many-to-one interference channel setting, of the class of 2 user
deterministic interference channels studied by El Gamal and Costa in [22].

• As we describe next, the MIMO Gaussian many-to-one interference channel is a special case of the class of
channels described above.

The MIMO Gaussian Many-to-One Interference Channel 

Consider the MIMO Gaussian interference channel with $M_i$ antennas at Transmitter $i$ and $N_i$ antennas at Receiver
$i$ so that $\mathcal{X}_i \subseteq \mathbb{C}^{M_i}$, $\mathcal{Y}_i = \mathbb{C}^{N_i}$, and

$$Y_i(\tau) = H_{ii} X_i(\tau) + Z_i(\tau), \quad i \in \mathcal{K}_1 \tag{3}$$

$$Y_1(\tau) = \sum_{i \in \mathcal{K}} H_{1i} X_i(\tau) + Z_1(\tau) \tag{4}$$

where, corresponding to the $\tau$th symbol, $X_i$ is the $M_i \times 1$ vector representing the input at Transmitter $i$, and $Y_i$, $Z_i$ are
the $N_i \times 1$ vectors representing the output and the additive white Gaussian noise at Receiver $i$. We assume
that all the noise vectors are circularly symmetric with zero mean and identity covariance matrix. This channel
can be reduced to the channel defined previously (Figure 2) if we set

$$V = \sum_{i=2}^{K} H_{1i} X_i + Z_1,$$

$$f_1(X_1, V) = H_{11} X_1 + V,$$

where $V$ is an $N_1 \times 1$ vector. Note that since the constraints on the inputs $X_i$ are fairly general, we capture most
scenarios of interest, such as inputs from a fixed finite-size constellation (for example, PSK or QAM) and inputs with
an average power constraint on the codeword. Further, for the special case where $\mathcal{X}_i = \mathbb{C}^{M_i}$ and an average
power constraint is imposed on the input codewords, we will give an explicit expression for the capacity of the
channel in the noisy interference regime (to be defined later) in terms of the powers $P_i$, where

$$E\left[ \frac{1}{T} \sum_{\tau=1}^{T} \|X_i(\tau)\|^2 \right] \leq P_i,$$

and $T$ denotes the length of the codeword.



III. A Noisy Interference Regime for Many-to-One Interference channels 

Before we proceed to the main result, we introduce a lemma which is useful in the proofs. 

Lemma 1: Consider random sequences $A^T, B^T, C^T$ such that $A \in \mathcal{A}$, $B \in \mathcal{B}$, $C \in \mathcal{C}$, and $B^T, C^T$ are generated
as

$$p(B^T, C^T, A^T) = p(A^T) \prod_{\tau=1}^{T} p(B(\tau)|A(\tau))\, p(C(\tau)|B(\tau)),$$

where $p(B(\tau)|A(\tau)) = p_{B|A}$ and $p(C(\tau)|B(\tau)) = p_{C|B}$ for all $\tau = 1, 2, \ldots, T$. Note that the sequence $A^T$ does not have to be generated in an i.i.d. fashion.
The sequences $B^T, C^T$ can be interpreted as the outputs of a physically degraded discrete memoryless broadcast
channel whose input sequence is $A^T$. Also note that $A^T \to B^T \to C^T$. Now, if $H(B^T), H(C^T)$ exist, then

$$H(B^T) - H(C^T) \leq \sum_{\tau=1}^{T} \left( H(B(\tau)) - H(C(\tau)) \right).$$

If $h(B^T), h(C^T)$ exist, then

$$h(B^T) - h(C^T) \leq \sum_{\tau=1}^{T} \left( h(B(\tau)) - h(C(\tau)) \right).$$

Further, suppose $\mathcal{A} = \mathbb{C}^N$, $p_{B|A} \sim \mathcal{N}(0, \Lambda_1)$ and $p_{C|B} \sim \mathcal{N}(0, \Lambda_2)$, or equivalently, let there be variables
$Z_i$, $i = 1, 2$, generated i.i.d. according to $Z_i \sim \mathcal{N}(0, \Lambda_i)$, $i \in \{1, 2\}$, and $\forall \tau \in \{1, 2, \ldots, T\}$

$$B(\tau) = A(\tau) + Z_1(\tau)$$

$$C(\tau) = B(\tau) + Z_2(\tau)$$

Also, consider a covariance matrix constraint on the sequence $A^T$, i.e.,

$$E\left[ \frac{1}{T} \sum_{\tau=1}^{T} A(\tau) A^{\dagger}(\tau) \right] \preceq \Gamma$$

for some covariance matrix $\Gamma$. Then the i.i.d. circularly symmetric Gaussian distribution on $A^T$ maximizes the quantity
$h(B^T) - h(C^T)$.

The proof of the lemma is placed in Appendix II. The reader may note that a special case of the above lemma,
where $\mathcal{A} = \mathcal{B} = \mathcal{C} = \mathbb{C}$ and $p_{B|A}$, $p_{C|B}$ are both Gaussian, is used in showing previous noisy interference results
for SISO interference channels [10]-[12]. We now present a noisy interference regime for the discrete memoryless
many-to-one interference channel considered in this paper.
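
One way to see why the first inequality of Lemma 1 is plausible is the following sketch. It is not necessarily the argument used in the appendix; $\tau$ indexes the symbols of the length-$T$ sequences, and the same steps apply with differential entropy $h$ in place of $H$ whenever the quantities exist.

```latex
\begin{align*}
H(B^T) - H(C^T)
  &= H(B^T \mid C^T) - H(C^T \mid B^T) \\
  &= H(B^T \mid C^T) - \sum_{\tau=1}^{T} H\big(C(\tau) \mid B(\tau)\big) \\
  &\leq \sum_{\tau=1}^{T} \Big( H\big(B(\tau) \mid C(\tau)\big) - H\big(C(\tau) \mid B(\tau)\big) \Big) \\
  &= \sum_{\tau=1}^{T} \Big( H\big(B(\tau)\big) - H\big(C(\tau)\big) \Big).
\end{align*}
```

The first and last equalities expand the joint entropy in two ways, the second uses the fact that $C^T$ is generated from $B^T$ through a memoryless channel, and the inequality follows from the chain rule together with the fact that conditioning reduces entropy.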

Theorem 1: In the many-to-one interference channel defined previously, if $p_{V|X_{\mathcal{K}_1}}$ is a degraded form of
$p_{Y_{\mathcal{K}_1}|X_{\mathcal{K}_1}} = \prod_{i \in \mathcal{K}_1} p_{Y_i|X_i}$, then its sum-capacity can be achieved with random coding and treating interference as
noise, and it can be expressed as

$$\max \sum_{i \in \mathcal{K}} I(X_i; Y_i), \tag{5}$$

where the maximization is carried out over all probability distributions on the inputs which factorize as $\prod_{i=1}^{K} p_{X_i}$.

The theorem is proved in Appendix III. We now apply the above result to the Gaussian setting below.

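Corollary 1: Consider the Gaussian many-to-one interference channel as defined by (3) and (4). If for $i \in \mathcal{K}_1$, there
exist covariance matrices $\Lambda_i$ so that

• $\sum_{i \in \mathcal{K}_1} \Lambda_i \preceq I_{N_1}$ and

• $U_i = H_{1i}X_i + \tilde{Z}_i$ is a stochastically degraded version of $Y_i$, where $\tilde{Z}_i \sim \mathcal{N}(0, \Lambda_i)$ is an auxiliary noise variable,

then a single-letter input distribution on the inputs, with Receiver 1 treating all interference as noise, is sum-capacity
optimal. Further, if $\mathcal{X}_i = \mathbb{C}^{M_i}$ with a power constraint $P_i$ on the input codeword of Transmitter $i$, then the optimal
distribution on the inputs is Gaussian, i.e., the capacity of the channel can be expressed as

$$C_s = \max_{\Gamma_i :\, \mathrm{tr}(\Gamma_i) \leq P_i,\; i \in \mathcal{K}} \; \sum_{i=1}^{K} \log \frac{\det\left( I_{N_i} + \sum_{k \in \mathcal{K}} H_{ik} \Gamma_k H_{ik}^{\dagger} \right)}{\det\left( I_{N_i} + \sum_{k \in \mathcal{K},\, k \neq i} H_{ik} \Gamma_k H_{ik}^{\dagger} \right)},$$

with $H_{ij} = 0$ if $i \notin \{1, j\}$.

The proof is almost identical to that of Theorem 1, on noting that $V$ is a degraded version of $U_{\mathcal{K}_1} = (U_2, U_3, \ldots, U_K)$,
which is, in turn, a degraded version of $Y_{\mathcal{K}_1}$, effectively implying that $V$ is a degraded version of $Y_{\mathcal{K}_1}$. For
completeness, we provide the proof in Appendix IV.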
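
For intuition, the sum-rate with Gaussian inputs and interference treated as noise can be evaluated in the single-antenna case. The sketch below is only an illustration under stated assumptions: scalar channels, unit noise variance, rates in bits (base-2 logarithm), and fixed transmit powers rather than the maximization over input covariances in the corollary. The function name `tin_sum_rate` and its argument layout are hypothetical.

```python
import math

def tin_sum_rate(direct_gains, cross_gains, powers):
    """Sum-rate (bits/symbol) with Gaussian inputs and treating
    interference as noise on a scalar many-to-one channel.

    direct_gains[i] is H_{ii} for user i+1 (i = 0 is Receiver 1),
    cross_gains[k] is H_{1,k+2}, the gain from Transmitter k+2 to
    Receiver 1, and powers[i] is the transmit power of user i+1.
    Unit noise variance is assumed at every receiver.
    """
    # Interference power seen at Receiver 1, the only interfered receiver.
    interference = sum(abs(h) ** 2 * p
                       for h, p in zip(cross_gains, powers[1:]))
    rate_1 = math.log2(1 + abs(direct_gains[0]) ** 2 * powers[0]
                       / (1 + interference))
    # Receivers 2..K see point-to-point Gaussian channels.
    rate_others = sum(math.log2(1 + abs(h) ** 2 * p)
                      for h, p in zip(direct_gains[1:], powers[1:]))
    return rate_1 + rate_others

# Hypothetical 3-user example with unit direct gains and unit powers.
print(tin_sum_rate([1, 1, 1], [0.5, 0.5], [1, 1, 1]))
```

The function evaluates the objective at the given powers only; whether those powers are the maximizing choice is a separate question that this sketch does not address.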

Remark: In this case, if the inputs come from a finite constellation such as PSK/QAM, then the capacity
characterization essentially involves determination of the optimal input distribution which maximizes (5). Unlike
the point-to-point channel, it is not clear even for symmetric constellations such as BPSK, whether the optimal
distribution is uniform on the inputs.

Remark: The noisy interference condition of Theorem 1 and the corresponding sum-capacity characterization
remain unchanged even if each Transmitter $i \in \mathcal{K}_1$ had an independent message for Receiver 1, along with the
usual message for Receiver $i$, so that there are $2K - 1$ messages in the system, i.e., even if each link of the
many-to-one channel carried a message. In this channel, with the channel satisfying the conditions of the theorem, all the
messages to Receiver 1 from Transmitter $i \neq 1$ will be set to null so that the channel operates as an interference
channel, for sum-capacity. The proof is almost identical to that for the interference channel, with minor adaptations
which are demonstrated in [23]; the reference showed that the noisy interference regime derived for the two-user
interference channel in [10]-[12] remains unchanged even if each transmitter had a message for all receivers to form
the two-user X channel.

A. Examples 

Example 1 - SIMO Gaussian Many-to-one Interference Channel: The above theorem generalizes the noisy
interference regime for many-to-one interference channels shown in [10]-[12]. To see this, consider the SIMO
many-to-one interference channel where all the inputs are scalars, whereas all the outputs are
vectors. In this case, note that the channel from Transmitter $i$ to Receiver $j$ can be represented by the vector $H_{ji}$.
Without loss of generality, let us assume that $\|H_{ii}\|^2 = 1$. In this case, it can be verified that, if

$$\sum_{i \in \mathcal{K}_1} \|H_{1i}\|^2 \leq 1,$$

then treating interference as noise is optimal. This can be seen with the auxiliary variables $\tilde{Z}_i \sim \mathcal{N}(0, \|H_{1i}\|^2)$
and, as mentioned in the above corollary, $U_i = H_{1i}X_i + \tilde{Z}_i$. For the SISO case, the above condition reduces to
the conditions specified in [10]-[12].
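
For the SISO case, the noisy interference condition $\sum_{j=2}^{K} |H_{1j}|^2 / |H_{jj}|^2 \leq 1$ is easy to check numerically. The sketch below is illustrative only; the function name `noisy_interference_margin` and the example gains are hypothetical. The returned margin is the variance of the extra independent noise that would have to be added to $\sum_j (H_{1j}/H_{jj}) Y_j$ to match the statistics of the effective interference at Receiver 1, so a non-negative margin corresponds to the degradedness condition.

```python
def noisy_interference_margin(direct_gains, cross_gains):
    """Return 1 - sum_j |H_1j|^2 / |H_jj|^2 for j = 2..K.

    direct_gains[k] is H_{jj} and cross_gains[k] is H_{1j} for the
    k-th interfering user (real or complex scalars). A non-negative
    result means the SISO noisy interference condition holds.
    """
    used = sum(abs(c) ** 2 / abs(d) ** 2
               for c, d in zip(cross_gains, direct_gains))
    return 1.0 - used

# Hypothetical gains: two weak interferers satisfy the condition,
# while a strong first interferer violates it.
print(noisy_interference_margin([1, 1], [0.5, 0.5]))
print(noisy_interference_margin([1, 1], [1.0, 0.5]))
```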

Example 2 - MIMO Gaussian Many-to-one Interference Channel: Consider a MIMO interference channel where
$M_i = N_i = M$, $\forall i \in \mathcal{K}_1$, and $N_1 = 1$. Since a MIMO $M \times M$ channel can be decomposed into $M$ parallel links
using singular-value decomposition, we can assume that the channel matrices $H_{ii}$, $i \in \mathcal{K}_1$, are diagonal without loss
of generality. Let $H_{ii}^{(k)}$ denote the $k$th diagonal entry of $H_{ii}$. $H_{1i}$, $i \in \mathcal{K}_1$, is a $1 \times M$ vector, whose $k$th entry we
denote by $H_{1i}^{(k)}$. Now, if

$$\sum_{i \in \mathcal{K}_1} \sum_{k=1}^{M} \frac{|H_{1i}^{(k)}|^2}{|H_{ii}^{(k)}|^2} \leq 1,$$

then treating interference as noise is optimal. This can be noted by setting $\tilde{Z}_i \sim \mathcal{N}\left(0, \sum_{k=1}^{M} \frac{|H_{1i}^{(k)}|^2}{|H_{ii}^{(k)}|^2}\right)$ and
$U_i = H_{1i}X_i + \tilde{Z}_i$, for $i \in \mathcal{K}_1$.

Example 3 - Fading Gaussian Many-to-one Interference Channel without CSIT: Consider a SISO Gaussian
interference channel with Rayleigh fading. In this case, let the received signals be expressed similarly to (3)-(4),
where all the quantities are scalars; the only difference is that, in this case, the channel fade $H_{ij}$ is time-varying.
Specifically, the input-output relations can be expressed as

$$Y_i(\tau) = H_{ii}(\tau) X_i(\tau) + Z_i(\tau), \quad i \in \mathcal{K}_1$$

$$Y_1(\tau) = \sum_{i \in \mathcal{K}} H_{1i}(\tau) X_i(\tau) + Z_1(\tau),$$

where all the receivers have 1 antenna so that the inputs $X_i$, outputs $Y_i$ and the channel gains $H_{ji}$ are scalars for
all $i \in \mathcal{K}_1$ and $j = 1$ or $j = i$. We assume that $H_{ij}(\tau)$ is drawn i.i.d. according to a circularly symmetric Gaussian
distribution $\mathcal{N}(0, \sigma_{ij}^2)$. Note that this is the classical Rayleigh fading model, where the magnitude of the fade $|H_{ij}|$
is Rayleigh distributed with parameter $\sigma_{ij}$. We assume that $H_{ij}$ is independent of $H_{i'j'}$ if $i \neq i'$ or $j \neq j'$. The
receivers are aware of the channel state information, or equivalently, the effective output at Receiver $j$, corresponding
to the $\tau$th symbol, can be expressed as $(Y_j(\tau), \mathcal{H}(\tau))$, where $\mathcal{H}(\tau) = \{H_{ij}(\tau) : i = 1 \text{ or } i = j,\; i, j \in \mathcal{K}\}$. We
assume that the transmitters do not have CSI, so that the input codewords are independent of the channel gains.
Suppose $\sum_{j=2}^{K} \frac{\sigma_{1j}^2}{\sigma_{jj}^2} \leq 1$; then the noisy interference condition is satisfied. This can be verified on noting that,
if the $\sigma_{ij}$ satisfy the specified condition, the effective interference at Receiver 1, $(V, \mathcal{H})$, is a degraded version of
$\left( \sum_{i \in \mathcal{K}_1} \frac{\sigma_{1i}}{\sigma_{ii}} Y_i, \mathcal{H} \right)$. Thus, this condition on the variances $\sigma_{ij}^2$ provides a condition for noisy interference in the
fading Gaussian many-to-one interference channel without CSIT.

Example 4 - Collision-based Interference Channel Model: Here, we construct a collision-based model for the many-to-one interference channel. Intuitively, the model can be explained as follows. Transmitter $i \in \mathcal{K}_1$ has two choices - it can transmit a symbol from a finite set $\bar{\mathcal{X}}_i$, or it can choose to remain silent. The signal transmitted by Transmitter $i \in \mathcal{K}_1$ is received perfectly at Receiver $i$. At Receiver 1, there are two possibilities: the signal transmitted by Transmitter 1 can be received perfectly, or it can be erased (due to a collision). The probability of collision/erasure can be designed based on the set of transmitters which are silent. Formally, the model can be constructed as follows. Consider a $K$ user many-to-one interference channel, where $\mathcal{X}_i = \{\phi\} \cup \bar{\mathcal{X}}_i$, where $\bar{\mathcal{X}}_i$ is a finite set which does not contain the element $\phi$. The element $\phi$ is used to indicate the case where user $i$ remains silent (note that this symbol can be used in the code at Transmitter $i$ to convey information to Receiver $i$). The received signals are defined as $Y_i = X_i$ for $i \in \mathcal{K}_1$, and $V \in \{0, e\}$ with

$$Y_1 = \begin{cases} X_1 & \text{if } V = 0, \\ e & \text{if } V = e. \end{cases}$$

Also, $V$ is drawn based on any probability distribution $p_{V|X_{\mathcal{K}_1}}$. In this channel, $V = e$ can be interpreted as an occurrence of a collision at Receiver 1. In this channel, clearly, $V$ is a degraded version of $Y_{\mathcal{K}_1}$ and therefore, the search for the sum-capacity of this channel is reduced to the search for the optimal single-letter distribution on all the inputs. Note that this model captures the traditional collision based medium access models - this can be seen by setting $V = e$ deterministically when any of the users that 'collide' with user 1 are transmitting any symbol other than $\phi$, and setting $V = 0$ otherwise. With the optimal input distribution, since treating interference as noise is optimal at Receiver 1, the receiver effectively observes a binary erasure channel, with the erasure probability calculated based on the input distribution and $p_{V|X_{\mathcal{K}_1}}$.
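
As a small numerical illustration of the traditional collision setting described above, the following Python sketch evaluates the erasure probability at Receiver 1 and the resulting sum rate when interference is treated as noise. It is only a sketch under stated assumptions: the function names, the convention that index 0 of each interferer's distribution is the silent symbol, and the choice that a collision occurs whenever any interferer is non-silent are ours and are not part of the formal model.

```python
import math


def entropy_bits(pmf):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def collision_sum_rate(p_x1, interferer_pmfs):
    """Erasure probability and sum rate for the classical collision model.

    Assumptions (illustrative only): index 0 of every interferer pmf is the
    silent symbol phi, and V = e whenever any interferer is non-silent.
    """
    # No collision only if every interferer stays silent (inputs are independent).
    p_no_collision = math.prod(pmf[0] for pmf in interferer_pmfs)
    erasure_prob = 1.0 - p_no_collision
    # Receiver 1 sees an erasure channel whose erasures are independent of X_1,
    # so I(X_1; Y_1) = (1 - erasure_prob) * H(X_1).
    rate_1 = (1.0 - erasure_prob) * entropy_bits(p_x1)
    # Receiver i != 1 observes Y_i = X_i, so its rate is H(X_i).
    rate_others = sum(entropy_bits(pmf) for pmf in interferer_pmfs)
    return erasure_prob, rate_1 + rate_others
```

For instance, with a uniform binary input at Transmitter 1 and two interferers that are silent half of the time, the sketch trades the rate carried by the interferers against the erasures they cause at Receiver 1.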

B. Separability of Parallel Noisy Discrete Memoryless Many-to-one Interference Channels 

In this section, we consider a parallel extension of the class of discrete memoryless many-to-one interference channels introduced in Section II. Specifically, we show that if the many-to-one interference channel formed over each carrier (parallel component) of the parallel channel satisfies the noisy interference condition of Theorem 1, then separate random coding - sending independent random codewords over each carrier - along with treating interference as noise is sum-capacity optimal. We now proceed to describe the model and our result formally.

The class of $F$-carrier parallel $K$-user discrete many-to-one interference channels considered can be represented by the set of $K$ (vector) inputs $X_i = (X_i^{(1)}, X_i^{(2)}, \ldots, X_i^{(F)}) \in \mathcal{X}_i^{(1)} \times \mathcal{X}_i^{(2)} \times \cdots \times \mathcal{X}_i^{(F)}$ and a set of $K$ (vector) outputs $Y_i = (Y_i^{(1)}, Y_i^{(2)}, \ldots, Y_i^{(F)}) \in \mathcal{Y}_i^{(1)} \times \cdots \times \mathcal{Y}_i^{(F)}$ for $i = 1, 2, \ldots, K$. The $k$th component of the output, $Y_i^{(k)}$, is determined by the $k$th component of the inputs $X_j^{(k)}$, $j = 1, 2, \ldots, K$, by any member of the class of discrete memoryless many-to-one interference channels defined earlier in Section II. Note that the variables $V = (V^{(1)}, V^{(2)}, \ldots, V^{(F)})$ and the functions $f_1 = (f_1^{(1)}, f_1^{(2)}, \ldots, f_1^{(F)})$ are used in defining the output at Receiver 1, with $V^{(k)}, f_1^{(k)}$ used in defining the $k$th component of the output. There are $K$ messages in the system, $W_1, W_2, \ldots, W_K$. The definition of a code of length $T$, the probability of error, the corresponding rate of the code, the capacity region and the sum-capacity are similar to those in Section II. The only difference is that, here, we restrict our study to constraints on the codewords which ensure that either

• $H(Y_i^{(k)T})$ and $H(Y_i^{(k)T} | X_i^{(k)T})$ exist for $k = 1, 2, \ldots, F$ and $i = 1, 2, \ldots, K$, or

• $h(Y_i^{(k)T})$ and $h(Y_i^{(k)T} | X_i^{(k)T})$ exist for $k = 1, 2, \ldots, F$ and $i = 1, 2, \ldots, K$,

where, as before, $T$ denotes the length of the codeword. The class of parallel discrete memoryless many-to-one interference channels considered here is, in fact, a special case of the class of channels defined in Section II. Before we proceed to our main result, we provide a parallel extension of Lemma 1.



Lemma 2: Consider random sequences $A^T, B^T, C^T$ such that $A = (A^{(1)}, A^{(2)}, \ldots, A^{(F)})$ and similarly, $B$ and $C$ represent $F$-dimensional vectors/tuples. $B^T, C^T$ are generated as

$$p(A^T, B^T, C^T) = p(A^T) \prod_{k=1}^{F} \prod_{\tau=1}^{T} p\big(B^{(k)}(\tau) | A^{(k)}(\tau)\big) \, p\big(C^{(k)}(\tau) | B^{(k)}(\tau)\big),$$

where $p(B^{(k)}(\tau) | A^{(k)}(\tau)) = p^{(k)}_{B|A}$ and $p(C^{(k)}(\tau) | B^{(k)}(\tau)) = p^{(k)}_{C|B}$ for all $\tau = 1, 2, \ldots, T$, $k = 1, 2, \ldots, F$. Note that the sequence $A^T$ does not have to be generated in an i.i.d. fashion. The sequences $B^T, C^T$ can be interpreted to be outputs of a physically degraded parallel ($F$-carrier) discrete memoryless broadcast channel whose (vector) input sequence is $A^T$. Also note that $A^T \to B^T \to C^T$. Now, if $H(B^{(k)T}), H(C^{(k)T})$ exist for all $k = 1, 2, \ldots, F$, then

$$H(B^T) - H(C^T) \leq \sum_{k=1}^{F} \sum_{\tau=1}^{T} \Big( H\big(B^{(k)}(\tau)\big) - H\big(C^{(k)}(\tau)\big) \Big).$$

If $h(B^{(k)T}), h(C^{(k)T})$ exist for $k = 1, 2, \ldots, F$, then

$$h(B^T) - h(C^T) \leq \sum_{k=1}^{F} \sum_{\tau=1}^{T} \Big( h\big(B^{(k)}(\tau)\big) - h\big(C^{(k)}(\tau)\big) \Big).$$

Further, suppose that the channels describing $B$ and $C$ are Gaussian parallel broadcast channels, with an average covariance constraint on the input corresponding to each carrier, i.e., with

$$E\left[ \frac{1}{T} \sum_{\tau=1}^{T} A^{(k)}(\tau) A^{(k)\dagger}(\tau) \right] = \Gamma^{(k)}$$

for some set of covariance matrices $\Gamma^{(k)}$, $k = 1, 2, \ldots, F$. Then an i.i.d. circularly symmetric Gaussian distribution on $A^T$, with each $A^{(k)}$ independent of $A^{(k')}$ for $k \neq k'$, maximizes the quantity $h(B^T) - h(C^T)$.
The proof of the lemma, which is similar to the proof of Lemma 1, is placed in Appendix V.

Corollary 2: Consider a parallel discrete memoryless many-to-one interference channel, where each of the parallel channels satisfies the noisy interference condition of Theorem 1, i.e., where $V^{(k)}$ is a degraded version of $Y_{\mathcal{K}_1}^{(k)}$ w.r.t. $X_{\mathcal{K}_1}^{(k)}$. Then separate random coding over each of the parallel carriers and treating interference as noise achieves sum-capacity. The sum-capacity $C_\Sigma$ can therefore be written as

$$C_\Sigma = \max_{\prod_{k=1}^{F} \prod_{i=1}^{K} p(x_i^{(k)})} \; \sum_{k=1}^{F} \sum_{i=1}^{K} I\big(X_i^{(k)}; Y_i^{(k)}\big).$$

The proof of the above corollary is omitted here, since it is almost identical to the proof of Theorem 1, with Lemma 2 used in the proof in place of Lemma 1. Since the class of parallel many-to-one interference channels described above is a special case of the class of many-to-one channels described in Theorem 1, the optimality of random coding and treating interference as noise simply follows from the theorem - the additional insight of the above corollary is that the optimal input distribution involves the principle of separate coding, i.e., in the optimal input distribution, $X_i^{(k)}$ is independent of $X_i^{(k')}$ for $k \neq k'$, for all $i \in \mathcal{K}$. The above corollary automatically implies that a set of parallel Gaussian many-to-one interference channels, each of which satisfies the noisy interference condition, is separable.

IV. On Interference Alignment In Noisy Interference Regimes and other Insights from 
Deterministic Many-to-One Interference Channels 

The deterministic many-to-one interference channel is the channel as described earlier, where $\mathcal{X}_i, \mathcal{Y}_i, \mathcal{V}$ are all finite and

$$H(Y_i | X_i) = H(V | X_{\mathcal{K}_1}) = 0, \quad \forall i \in \mathcal{K}_1, \qquad (6)$$

for all possible distributions on the input. Note that in this channel, the outputs can be uniquely determined from the set of inputs of the channel. Also note that this class of channels captures the deterministic framework proposed by [24], and studied in the many-to-one interference channel setting in [4]. We next proceed to understand the idea of interference alignment via random codes in this deterministic framework.



A. Discussion : Why is explicit interference alignment not required in the noisy interference regime? 

For the deterministic many-to-one channel defined in (6), we present the noisy interference regime in a slightly different form, which leads to an interesting interpretation later in this section.

Corollary 3: In the many-to-one interference channel, if there exists a function $q$ such that $V = q(Y_{\mathcal{K}_1})$, then the sum-capacity is given by

$$C_\Sigma = \max_{\prod_{i \in \mathcal{K}} p(x_i)} H(Y_1) + H(Y_{\mathcal{K}_1} | V).$$

Proof: The proof follows from setting $H(Y_i | X_i) = H(V | X_{\mathcal{K}_1}) = 0$ for all $i \in \mathcal{K}_1$ in Theorem 1 and noting that

$$\sum_{i \in \mathcal{K}_1} H(Y_i) - H(V) = H(Y_{\mathcal{K}_1}) - H(V) = H(Y_{\mathcal{K}_1} | V),$$

since $Y_i$ is independent of $Y_j$ for $i \neq j$, and $V$ is a deterministic function of $Y_{\mathcal{K}_1}$. ■
Remark 1: The above class of channels can be considered to be weak interference channels, because the condition of the result implies that the effective interference at Receiver 1 must be reconstructible from the signals received at all receivers $i \neq 1$. The fact that random coding and treating interference as noise is optimal in the weak many-to-one interference channels can also be verified in the class of deterministic many-to-one interference channels studied by Bresler, Parekh and Tse [4].
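
The identity used in the proof of Corollary 3 can be checked numerically on a toy example. The sketch below is illustrative only and rests on assumptions of ours: it takes two independent interferer outputs $Y_2, Y_3$ with user-supplied distributions and a user-supplied deterministic function $q$, and the function names are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def entropy_bits(pmf):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def corollary3_terms(p2, p3, q):
    """Return (H(Y2) + H(Y3) - H(V), H(Y2, Y3 | V)) for V = q(Y2, Y3).

    p2 and p3 are the distributions of the independent outputs Y2 and Y3.
    """
    joint_yv = defaultdict(float)  # distribution of (Y2, Y3, V)
    p_v = defaultdict(float)       # distribution of V
    for (y2, a), (y3, b) in itertools.product(p2.items(), p3.items()):
        v = q(y2, y3)
        joint_yv[(y2, y3, v)] += a * b
        p_v[v] += a * b
    lhs = entropy_bits(p2) + entropy_bits(p3) - entropy_bits(p_v)
    # H(Y2, Y3 | V) = H(Y2, Y3, V) - H(V)
    rhs = entropy_bits(joint_yv) - entropy_bits(p_v)
    return lhs, rhs
```

With, say, $q$ chosen as the binary XOR, the two returned quantities coincide, as the proof requires for any deterministic $q$ and independent outputs.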

For a better understanding of the noisy interference regime, let us take a closer look at the idea of interference alignment over the deterministic 3-user many-to-one channel. Over this channel, consider a random coding scheme of length $T$, such that it generates $2^{TR_2}$ typical sequences of $X_2^T$ and $2^{TR_3}$ typical sequences of $X_3^T$. Since messages in the system are distributed uniformly, this means that $H(X_i^T) = TR_i$, $i = 2, 3$. Now, if $(R_2, R_3)$ lies in the achievable rate region (with these distributions) of the multiple access channel formed with inputs $X_2, X_3$ and output $V$, then the sequences $X_2^T$ and $X_3^T$ are invertible (i.e., decodable) from $V^T$. This is the case because random coding is optimal in the multiple access channel. In other words, the sequences $X_2^T, X_3^T$ are not aligned; in fact, they are resolvable from $V^T$, and each $V^T$ sequence therefore corresponds to a unique $(X_2^T, X_3^T)$ sequence pair with high probability. Such codewords would satisfy

$$H(V^T) \approx T(R_2 + R_3) = H(X_2^T) + H(X_3^T) = H(X_2^T, X_3^T).$$

The approximation sign is used above rather than equality, since the comparison is in an asymptotic sense. In contrast to the above scenario, if $(R_2, R_3)$ lies outside the rate region achieved with these distributions in the multiple access channel, then even with random coding, the sequences $X_2^T, X_3^T$ align. In particular, in the noisy interference regime, $V = q(Y_2, Y_3)$ and in the achievable scheme, $R_i = H(Y_i)$, which imply that

$$H(V^T) < T(R_2 + R_3) = H(Y_2^T, Y_3^T)$$

as long as $q$ is a non-invertible function (the case of $q$ being invertible falls in the class of the resolvable interference regime discussed later in this section). Note that the above condition holds in the noisy interference regime for every possible input distribution. Thus, explicit alignment by an appropriate choice of input distribution or by multi-letter structured coding is not required, and random codes automatically align interference in the above channel. The noisy nature of the interference can be explained by the insights of reference [25], which noted that on a single-user channel, if a random code of a rate higher than a user's capacity is used, then the signal loses structure in the sense that the output satisfies an equipartition property independent of the codebook used. This loss of structure can be used to explain the noisy nature of the interference, on noting that for any given input distribution, with random coding, $R_i = H(Y_i) > I(V; X_i | X_{\mathcal{K}_1 - \{i\}}) = H(V | X_{\mathcal{K}_1 - \{i\}})$ in the noisy interference regime; in other words, from the perspective of a receiver with output $V$, the rate of transmission of the user is higher than the corresponding user's mutual information, and thus the interfering signal loses any structure imposed by its codebook. The additional insight here is that, if the rate of each incoming signal at a receiver is higher than that user's mutual information, then not only does the signal lose its structure, but the multiple signals also align. In fact, we will argue later in this section (Section IV-C) that the extent of alignment is also the maximum possible in this case. These insights carry through to the Gaussian case as well, where, if each user $i \in \mathcal{K}_1$ transmits at a rate corresponding to its single-user capacity, then the interference is noisy and is aligned - even by circularly symmetric Gaussian distributions at all inputs.



B. Resolvable Interference Regime for Deterministic Many-to-one Interference Channels 

In this class of many-to-one interference channels, interference cannot be aligned since the multiple interferers at the first receiver are resolvable. Since alignment is not possible, random coding achieves the capacity region. We first show outer and inner bounds for the deterministic many-to-one interference channel, respectively, in Theorems 2 and 3. We then find conditions where these bounds are tight to define the resolvable interference regime in Corollary 4. The bounds and the regime are all defined in terms of auxiliary variables $U_{\mathcal{K}_1} = (U_2, U_3, \ldots, U_K)$ such that

• $U_i$ is a deterministic function of $X_i$ and

• $V$ is a deterministic function of $(U_2, U_3, \ldots, U_K)$.

Note that $U_i = X_i$ provides a trivial assignment of auxiliary variables $U$. However, the bounds can be optimized over all possible choices of $U$ satisfying these properties. We now proceed to describe an outer bound on the capacity region of the many-to-one interference channel.

Theorem 2: The capacity region of the deterministic many-to-one interference channel lies in the convex hull of the following region, over all possible product distributions $\prod_{i \in \mathcal{K}} p_{X_i}(x_i)$.

$$R_1 \leq H(Y_1 | V) \qquad (7)$$
$$R_i \leq H(Y_i), \quad i \in \mathcal{K}_1 \qquad (8)$$
$$R_1 + \sum_{i \in S} R_i \leq H(Y_1 | U_{S^c}) + H(Y_S | V, U_{S^c}), \quad \forall S \subseteq \mathcal{K}_1, \qquad (9)$$

where $S^c$ represents the complement of $S$ w.r.t. $\mathcal{K}_1$.
The above outer bound is proved in Appendix VI.

We now describe a rate region achievable in general in the deterministic many-to-one interference channel, using a random coding scheme which does not align interference. The achievable scheme is similar to the one presented in [26] in the context of the deterministic Z channel.

Theorem 3: For the deterministic many-to-one interference channel, the convex hull, over all product input distributions $\prod_{i \in \mathcal{K}} p_{X_i}(x_i)$, of the following rate region is achievable.

$$R_1 \leq H(Y_1 | V) \qquad (10)$$
$$R_i \leq H(Y_i), \quad i \in \mathcal{K}_1 \qquad (11)$$
$$R_1 + \sum_{i \in S} R_i \leq H(Y_1 | U_{S^c}) + \sum_{i \in S} H(Y_i | U_i), \quad \forall S \subseteq \mathcal{K}_1, \; S^c = \mathcal{K}_1 - S. \qquad (12)$$

The proof is placed in Appendix VII. It should be noted that the above achieved rate region is loose, in general, with respect to the bound of Theorem 2. However, if $H(U_S | V) = 0$ for all $S \subseteq \mathcal{K}_1$, then the achieved rate region can be verified to be optimal by comparing (7)-(9) with (10)-(12). We state this formally below.

Corollary 4: Consider a many-to-one interference channel where $U_{\mathcal{K}_1}$ is invertible from $V$, i.e., $H(U_{\mathcal{K}_1} | V) = 0$ for all possible input distributions. Then, the capacity region of the many-to-one interference channel is given by (10)-(12).

Proof: Note that it is sufficient to show that the right hand sides of (12) and (9) are equal. We show this below.

$$H(Y_1 | U_{S^c}) + \sum_{i \in S} H(Y_i | U_i)$$
$$= H(Y_1 | U_{S^c}) + H(Y_S | U_S) \qquad (13)$$
$$= H(Y_1 | U_{S^c}) + H(Y_S | U_S, U_{S^c}) \qquad (14)$$
$$= H(Y_1 | U_{S^c}) + H(Y_S | V, U_S, U_{S^c}) \qquad (15)$$
$$= H(Y_1 | U_{S^c}) + H(Y_S | V, U_{S^c}). \qquad (16)$$

In (13), (14), we have used the fact that $(Y_i, U_i)$ is independent of $(Y_j, U_j)$ for $i \neq j$, $i, j \in \mathcal{K}_1$. We have also used the fact that $U_S$ is invertible from $V$ in the final equation above. ■
Note that $U_i$ can be interpreted as the effective interference caused by Transmitter $i \neq 1$ at Receiver 1. Also, note that with any achievable coding scheme in this channel, Receiver 1 can decode $X_1^T$, and because of (2), invert $V^T$ in this channel. The condition that $U_{\mathcal{K}_1}$ is invertible from $V$ means that all the interfering signals $U_i^T$ are resolvable at the first receiver, and hence alignment is precluded irrespective of the coding scheme used. Therefore, not surprisingly, the random coding achievable scheme of Theorem 3 is optimal for this class of channels.

It must be noted that the achievable schemes of Theorem 3 and Corollary 3 are different random coding schemes. The schemes differ, in particular, in the decoding procedure at Receiver 1 and hence achieve different rates. In the achievable scheme of Theorem 3, the rate region is achieved with Receiver 1 picking the sequence $X_1^T$ such that $Y_1^T, U_{\mathcal{K}_1}^T, X_1^T$ are jointly typical (see Appendix VII). In contrast, in the decoding scheme for Corollary 3, the sequence $X_1^T$ is decoded as the one such that $(Y_1^T, X_1^T)$ are jointly typical, i.e., the interference is treated as noise.

C. Discussion : How many bits of additional rate can interference alignment provide ? 

The achievable scheme of Theorem 3 does not involve interference alignment and is therefore optimal only when alignment is precluded on the many-to-one interference channel. The resolvability condition of the channel described in Corollary 4 above is precisely one where alignment is precluded. However, in general, if the resolvability condition is not satisfied, then on comparing (7)-(9) with (10)-(12), we can conclude that an additional rate of $\Delta_S = H(Y_S | V, U_{S^c}) - H(Y_S | U_S)$ should be achieved by alignment for the users belonging to $S$ for the outer bound to be tight. It is not clear whether this additional rate can be achieved at all, in general, or whether the outer bound is loose. However, the results on the noisy interference regime imply that if the many-to-one interference channel is weak, then this additional rate can be achieved using interference alignment via random coding, and the outer bound is tight in a sum-capacity sense (compare the expression of Corollary 3 with (9)). In other words, the extent of alignment is optimal in the sense that the additional rate benefit provided by alignment is the maximum possible. If the channel is not weak, then $\Delta_S$ can be interpreted as a bound on the amount of additional rate that can be obtained via alignment for the users in $S \subseteq \mathcal{K}_1$.
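
To make the quantity $\Delta_S$ concrete, the following Python sketch evaluates it for a toy deterministic 3-user channel with $S = \{2, 3\}$ (so that $S^c$ is empty). The setup is entirely an assumption of ours for illustration: uniform binary inputs, $Y_i = U_i = X_i$ for $i = 2, 3$, and a user-supplied function $q$ with $V = q(U_2, U_3)$; the function names are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def cond_entropy(joint, target, given):
    """H(target | given) in bits.

    joint maps outcome tuples to probabilities; target and given are tuples
    of coordinate indices into each outcome tuple.
    """
    p_tg = defaultdict(float)
    p_g = defaultdict(float)
    for outcome, p in joint.items():
        g = tuple(outcome[i] for i in given)
        t = tuple(outcome[i] for i in target)
        p_tg[(t, g)] += p
        p_g[g] += p
    return -sum(p * math.log2(p / p_g[g]) for (t, g), p in p_tg.items() if p > 0)


def alignment_gap(q):
    """Delta_S = H(Y_S | V) - H(Y_S | U_S) for S = {2, 3} in the toy channel."""
    joint = {}
    for x2, x3 in itertools.product((0, 1), repeat=2):
        # Outcome layout: (Y2, Y3, U2, U3, V), with Y_i = U_i = X_i.
        joint[(x2, x3, x2, x3, q(x2, x3))] = 0.25
    return cond_entropy(joint, (0, 1), (4,)) - cond_entropy(joint, (0, 1), (2, 3))
```

In this toy setting, an invertible $q$ (the resolvable case) gives a gap of zero, whereas a non-invertible $q$ such as XOR gives a strictly positive gap, in line with the discussion above.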

D. An open question : When does a channel have a single-letter capacity characterization ? 

A clear open problem motivated by this work is a capacity characterization of deterministic many-to-one interference channels. This question is particularly intriguing because it is not clear whether the channel allows a single-letter capacity characterization. Previous works on approximating the capacity of the channel motivate the need for structured (lattice) coding based achievable schemes [4]. It has been discussed in [19] that for channels where structured codes are necessary, single-letter characterizations may not exist. This is because coding schemes such as linear and lattice codes introduce structure as correlations in multiple uses of the channel. Interestingly, single-letter based lattice coding schemes (i.e., single-dimensional lattices) are shown to suffice for a degrees of freedom characterization of almost all interference channels in [9]; however, it has been argued that multi-dimensional lattices are useful for finer characterizations of capacity [27]. Thus, the question of the existence of single-letter characterizations of interference channels, and of more general wireless networks, remains wide open. The issue of the existence of single-letter capacity characterizations also appears in several broadcast channel scenarios. The study of degrees of freedom of compound broadcast networks [28], [29] suggests the possibility of alignment and structured coding in the channel, whereas, for certain degraded settings, the broadcast (multicast) channel allows single-letter capacity characterizations (see [30] and references therein). Thus, an important open question in network information theory is a better understanding of structured codes and their impact on capacity characterizations of discrete memoryless channels.

V. Conclusion 

We generalize the noisy interference regimes, previously shown in average-power constrained SISO Gaussian 
interference channels, to the discrete memoryless many-to-one interference channel. In this noisy interference regime, 
random coding at all transmitters and treating interference as noise at the receiver which faces interference achieves 
sum-capacity on the many-to-one interference channel. Our generalization enables extension of the noisy interference 
regimes to the Gaussian MIMO and parallel many-to-one interference channels and the fading Gaussian many-to-
one interference channels without CSIT. Unlike previous results which consider an average power constraint on
the inputs, we also show that treating interference as noise is optimal in the Gaussian many-to-one interference 
channel, even if the inputs are constrained to come from fixed finite constellations such as QAM or PSK. Through 
the lens of interference alignment, we obtain a better understanding of why random (Gaussian) codewords are 
sufficient to achieve capacity in the noisy interference regime in the Gaussian interference channels. In particular, 



we argue that if users transmit, using random coding, at rates higher than the interfering link's mutual information, 
then the interference is noisy and the extent of alignment is maximum. Such alignment hence obviates the need for 
techniques such as structured (lattice) codes which have been shown to be immensely useful in other regimes. We 
also show that for deterministic many-to-one interference channels, if the interferers are resolvable at the receiver 
facing interference, random coding achieves capacity since interference alignment is precluded. While we are able 
to provide single-letter characterizations for certain classes of channels in this paper, the question of the existence 
of single letter characterizations for wireless networks, in general, remains open. 

Appendix I 

Equivalence of Our Noisy Interference Regime and that of [10]— [12] for Single- Antenna 
Gaussian Many-to-One Interference Channels 

We have already shown in the introduction that the noisy interference regime of [10]-[12] is included in the regime described by our result. Here, we show that our noisy interference regime is no larger than the regime described in [10]-[12]. In particular, we show that if the condition of [10]-[12] is not satisfied, then the effective interference at Receiver 1 is not a degraded version of the set of all signals at the other receivers. Specifically, we show that if

$$\sum_{i=2}^{K} \frac{|H_{1i}|^2}{|H_{ii}|^2} > 1, \qquad (17)$$

then, for any possible set of values $\rho_i = E[Z_1 Z_i^*]$, $i = 2, 3, \ldots, K$, there exists a distribution on $X_i$, $i = 1, 2, \ldots, K$ such that

$$I(X_2, X_3, \ldots, X_K; V | Y_2, Y_3, \ldots, Y_K) > 0, \qquad (18)$$

where $V = \sum_{i=2}^{K} H_{1i} X_i + Z_1$. To see this, note that, since $\{Z_2, \ldots, Z_K\}$ are independent Gaussian random variables, the correlation matrix of $[Z_1, Z_2, \ldots, Z_K]$ (with unit-variance noise) has the form

$$\begin{bmatrix} 1 & \rho_2 & \rho_3 & \cdots & \rho_K \\ \rho_2^* & 1 & 0 & \cdots & 0 \\ \rho_3^* & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_K^* & 0 & 0 & \cdots & 1 \end{bmatrix}.$$

Note that the above matrix has to be a positive semidefinite matrix, whose determinant is non-negative. This implies that

$$\sum_{i=2}^{K} |\rho_i|^2 \leq 1.$$

(17) and the above inequality together imply that there exists $i_0 \in \{2, 3, \ldots, K\}$ such that $\frac{|H_{1 i_0}|}{|H_{i_0 i_0}|} > |\rho_{i_0}|$. We are now ready to provide the input distribution for which (18) is satisfied. Consider the input distribution where $X_i = 0$ deterministically for all $i \in \{2, 3, \ldots, K\} - \{i_0\}$, and $X_{i_0}$ is a circularly symmetric zero-mean Gaussian random variable having some positive (non-zero) variance. Then, we can write



$$\begin{aligned}
I(X_2, X_3, \ldots, X_K; V | Y_2, Y_3, \ldots, Y_K)
&= I\big(X_{i_0}; H_{1 i_0} X_{i_0} + Z_1 \,\big|\, Z_2, \ldots, Z_{i_0 - 1}, H_{i_0 i_0} X_{i_0} + Z_{i_0}, Z_{i_0 + 1}, \ldots, Z_K\big) \\
&\geq I\big(X_{i_0}; H_{1 i_0} X_{i_0} + Z_1 \,\big|\, H_{i_0 i_0} X_{i_0} + Z_{i_0}\big) \qquad (19) \\
&\geq 0, \qquad (20)
\end{aligned}$$

where (19) follows from the independence of $X_{i_0}$ and $Z_i$, $i \in \{2, 3, \ldots, K\}$. In (20), the equality is satisfied (for Gaussian $X_{i_0}$) only if $|\rho_{i_0}| = \frac{|H_{1 i_0}|}{|H_{i_0 i_0}|}$. However, this condition is not satisfied for our choice of $i_0$, and we have, for this input distribution, $I(X_2, X_3, \ldots, X_K; V | Y_2, Y_3, \ldots, Y_K) > 0$, i.e., the interference at Receiver 1 is not a degraded version of the outputs at the other receivers. This shows the equivalence between our characterization of the noisy interference regime and the characterization in [10]-[12]. It must be noted that we do not claim that, if (17) is satisfied, the interference is not noisy - we only claim that it lies outside our characterization of the noisy interference regime. Whether the noisy interference regime characterized by us can be expanded is an interesting open question.
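
For the single-antenna Gaussian case discussed in this appendix, the boundary of the regime can be checked with a few lines of Python. This is a minimal sketch under assumptions of ours: unit-variance noise at every receiver, the regime taken to be the complement of the strict inequality above (i.e., the sum of $|H_{1i}|^2 / |H_{ii}|^2$ over the interferers being at most 1), and hypothetical function and argument names.

```python
def noisy_interference_margin(direct_gains, cross_gains):
    """Return 1 - sum_i |H_1i|^2 / |H_ii|^2 over the interferers i = 2..K.

    direct_gains[j] is H_ii and cross_gains[j] is H_1i for the j-th
    interferer (real or complex). A non-negative margin means the condition
    assumed in the lead-in is met.
    """
    if len(direct_gains) != len(cross_gains):
        raise ValueError("need one direct gain and one cross gain per interferer")
    total = sum(abs(c) ** 2 / abs(d) ** 2
                for c, d in zip(cross_gains, direct_gains))
    return 1.0 - total


def in_noisy_interference_regime(direct_gains, cross_gains):
    """True if the assumed condition sum_i |H_1i|^2 / |H_ii|^2 <= 1 holds."""
    return noisy_interference_margin(direct_gains, cross_gains) >= 0.0
```

The margin is returned separately so that a caller can see how far a given set of channel gains is from the boundary, rather than only a yes/no answer.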

Appendix II 
Proof of Lemma 1

From the Markov chain property on $A, B, C$, we can write

$$\begin{aligned}
I(A^T; B^T) &= I(A^T; B^T, C^T) \\
\Rightarrow I(A^T; B^T) &= I(A^T; C^T) + I(A^T; B^T | C^T) \\
\Rightarrow h(B^T) - h(C^T) &= h(B^T | C^T) + h(B^T | A^T) - h(C^T | A^T) - h(B^T | C^T, A^T) \\
&\overset{(a)}{\leq} \sum_{\tau=1}^{T} \Big( h(B(\tau) | C(\tau)) + h(B(\tau) | A(\tau)) - h(C(\tau) | A(\tau)) - h(B(\tau) | C(\tau), A(\tau)) \Big) \\
&= \sum_{\tau=1}^{T} \Big( h(B(\tau) | C(\tau)) + h(B(\tau) | A(\tau)) - h(C(\tau), B(\tau) | A(\tau)) \Big) \\
&= \sum_{\tau=1}^{T} \Big( h(B(\tau) | A(\tau)) + h(B(\tau), C(\tau)) - h(C(\tau)) - h(C(\tau), B(\tau) | A(\tau)) \Big) \\
&= \sum_{\tau=1}^{T} \Big( h(B(\tau) | A(\tau)) + I(B(\tau), C(\tau); A(\tau)) - h(C(\tau)) \Big) \\
&= \sum_{\tau=1}^{T} \Big( h(B(\tau)) - I(B(\tau); A(\tau)) + I(B(\tau), C(\tau); A(\tau)) - h(C(\tau)) \Big) \\
&\overset{(b)}{=} \sum_{\tau=1}^{T} \Big( h(B(\tau)) - h(C(\tau)) \Big),
\end{aligned}$$

where in (a), we have used the chain rule and the fact that conditioning cannot increase differential entropy in the first two terms, and the following facts in the final two terms above:

$$p(C^T | A^T) = \prod_{\tau=1}^{T} p(C(\tau) | A(\tau)),$$
$$p(B^T | C^T, A^T) = \frac{p(B^T, C^T | A^T)}{p(C^T | A^T)} = \prod_{\tau=1}^{T} \frac{p(B(\tau), C(\tau) | A(\tau))}{p(C(\tau) | A(\tau))} = \prod_{\tau=1}^{T} p(B(\tau) | C(\tau), A(\tau)).$$

In (b), we have used the Markov chain property on $A, B, C$, which implies that $I(A(\tau); B(\tau), C(\tau)) = I(A(\tau); B(\tau))$. Now, if $p_{B|A}, p_{C|B}$ are both Gaussian, then the fact that the Gaussian input distribution on $A(\tau)$ with the appropriate covariance matrix maximizes $h(B(\tau) | C(\tau))$ in step (a), combined with the concavity of entropy, implies that an i.i.d. Gaussian distribution on the input maximizes $h(B^T) - h(C^T)$.
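
The role of concavity in the final step can be illustrated in the simplest scalar setting. The sketch below is an assumption-laden illustration of ours, not part of the proof: it takes a scalar degraded Gaussian chain $B = A + N_1$, $C = B + N_2$ with circularly symmetric complex Gaussian input and noise, for which $h(B) - h(C) = \log_2\big((P + n_1)/(P + n_1 + n_2)\big)$ at input power $P$, and compares a time-varying power allocation with the constant allocation of the same average power. All names are hypothetical.

```python
import math


def entropy_gap(power, n1, n2):
    """h(B) - h(C) in bits for Gaussian A of the given power,
    with B = A + N1 and C = B + N2 (noise variances n1 and n2)."""
    return math.log2((power + n1) / (power + n1 + n2))


def per_letter_vs_average(powers, n1=1.0, n2=1.0):
    """Compare the time-averaged gap under per-symbol powers with the gap
    obtained when the same average power is used at every symbol."""
    avg_power = sum(powers) / len(powers)
    per_letter = sum(entropy_gap(p, n1, n2) for p in powers) / len(powers)
    return per_letter, entropy_gap(avg_power, n1, n2)
```

Since the gap is a concave function of the power, the first returned value never exceeds the second, which is the scalar counterpart of i.i.d. inputs being optimal under an average constraint.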

Appendix III 
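Proof of Theorem 1

Achievability of the required rate follows trivially from typical set decoding arguments. We prove the converse here, for the case where the differential entropies of $Y_k$ and $Y_k | X_k$ exist. The proof for the case where the corresponding entropy terms exist, rather than the differential entropies, is essentially identical, with the mutual information expressed in terms of the entropy of the variables. Consider any coding scheme of length $T$ achieving rates $R_i$ for user $i$. Then, from Fano's inequality, for any $\epsilon > 0$, we can write

$$\begin{aligned}
T(R_1 - \epsilon / K) &\leq I(Y_1^T; X_1^T) \\
&\leq h(Y_1^T) - h(Y_1^T | X_1^T) \\
&= h(Y_1^T) - h(V^T) \\
&\leq \sum_{\tau=1}^{T} h(Y_1(\tau)) - h(V^T), \qquad (21)
\end{aligned}$$

where we have used (2) above. Now, note that since the capacity of the channel only depends on the marginal distributions $p_{V | X_{\mathcal{K}_1}}$, $p_{Y_i | X_i}$, $i \in \mathcal{K}_1$, we can make $V$ a physically degraded version of $Y_{\mathcal{K}_1}$ so that $X_{\mathcal{K}_1}^T \to Y_{\mathcal{K}_1}^T \to V^T$ without changing the capacity. We can now write, for $i \in \mathcal{K}_1$,

$$\begin{aligned}
T(R_i - \epsilon / K) &\leq I(Y_i^T; X_i^T) \\
&= h(Y_i^T) - h(Y_i^T | X_i^T) \\
&= h(Y_i^T) - \sum_{\tau=1}^{T} h(Y_i(\tau) | X_i(\tau)) \\
\Rightarrow T \sum_{i \in \mathcal{K}_1} (R_i - \epsilon / K) &\leq \sum_{i \in \mathcal{K}_1} \Big( h(Y_i^T) - \sum_{\tau=1}^{T} h(Y_i(\tau) | X_i(\tau)) \Big) \\
&= h(Y_{\mathcal{K}_1}^T) - \sum_{i \in \mathcal{K}_1} \sum_{\tau=1}^{T} h(Y_i(\tau) | X_i(\tau)), \qquad (22)
\end{aligned}$$

where the final equation follows from the fact that $Y_i^T$ is independent of $Y_j^T$ for all $i \neq j$. Adding (21) and (22), we get

$$\begin{aligned}
T\Big( \sum_{i \in \mathcal{K}} R_i - \epsilon \Big) \qquad & (23) \\
&\leq \sum_{\tau=1}^{T} h(Y_1(\tau)) - h(V^T) + h(Y_{\mathcal{K}_1}^T) - \sum_{i \in \mathcal{K}_1} \sum_{\tau=1}^{T} h(Y_i(\tau) | X_i(\tau)) \qquad (24) \\
&\leq \sum_{\tau=1}^{T} h(Y_1(\tau)) + \sum_{\tau=1}^{T} \Big( h(Y_{\mathcal{K}_1}(\tau)) - h(V(\tau)) \Big) - \sum_{i \in \mathcal{K}_1} \sum_{\tau=1}^{T} h(Y_i(\tau) | X_i(\tau)) \qquad (25) \\
&= \sum_{\tau=1}^{T} \Big( h(Y_1(\tau)) - h(V(\tau)) + \sum_{i \in \mathcal{K}_1} \big( h(Y_i(\tau)) - h(Y_i(\tau) | X_i(\tau)) \big) \Big) \qquad (26) \\
&= \sum_{\tau=1}^{T} \Big( I(Y_1(\tau); X_1(\tau)) + \sum_{i \in \mathcal{K}_1} I(X_i(\tau); Y_i(\tau)) \Big) \qquad (27) \\
&= \sum_{\tau=1}^{T} \sum_{i \in \mathcal{K}} I(Y_i(\tau); X_i(\tau)) \qquad (28) \\
&\leq T \max_{\tau} \sum_{i \in \mathcal{K}} I(Y_i(\tau); X_i(\tau)) \qquad (29) \\
&\leq T \max \sum_{i \in \mathcal{K}} I(X_i; Y_i), \qquad (30)
\end{aligned}$$

where (25) follows from the fact that $X_{\mathcal{K}_1}^T \to Y_{\mathcal{K}_1}^T \to V^T$, combined with the result of Lemma 1. The outer bound hence follows.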



Appendix IV 
Proof of Corollary 1

The condition of the corollary implies that a Gaussian random $N_1$-dimensional vector $Z$ can be found so that $V = \sum_{i \in \mathcal{K}_1} U_i + Z$ and $Y_1 = X_1 + V$, which means that $V$ is a degraded version of $U_{\mathcal{K}_1}$. This fact, combined with the condition that $U_i$ is a degraded version of $Y_i$, implies that $V$ is a degraded version of $Y_{\mathcal{K}_1}$, as required by Theorem 1. The optimality of random coding and treating interference as noise hence follows from the theorem. Here, we only need to show that Gaussian inputs are optimal when $\mathcal{X}_i = \mathbb{C}^{M_i}$ and there is a power constraint on the inputs. Consider any achievable scheme where $E\big[ \frac{1}{T} \sum_{\tau=1}^{T} X_i(\tau) X_i^{\dagger}(\tau) \big] = \Gamma_i$. Then, following the steps of the proof of Theorem 1, we can derive equation (25), which is reproduced below.

$$T\Big( \sum_{i \in \mathcal{K}} R_i - \epsilon \Big) \leq \sum_{\tau=1}^{T} h(Y_1(\tau)) + \sum_{\tau=1}^{T} \Big( h(Y_{\mathcal{K}_1}(\tau)) - h(V(\tau)) \Big) - \sum_{i \in \mathcal{K}_1} \sum_{\tau=1}^{T} h(Y_i(\tau) | X_i(\tau))$$

Here, we can use the Gaussian distribution to evaluate each of the entropy terms above. To see this, we invoke the concavity of entropy, and the fact that under a covariance matrix constraint, the Gaussian distribution maximizes entropy in the first term above. We can use Lemma 1, which shows that the use of the Gaussian distribution to evaluate the second and third terms above outer bounds the terms. The final entropy term is evaluated using the Gaussian distribution because of the definition of the channel. Thus, if we have a power constraint, we are restricted to the set of all Gaussian distributions on the input at Transmitter $i$ with covariance $\Gamma_i$, where $\mathrm{tr}(\Gamma_i) \leq P_i$.

Appendix V 
Proof of Lemma 2

Proof: The proof is similar to the proof of Lemma 1. We only highlight the differences here. We have

$$\begin{aligned}
& A^T \to B^T \to C^T \\
\Rightarrow \; & I(A^T; B^T) = I(A^T; C^T) + I(A^T; B^T | C^T) \\
\Rightarrow \; & h(B^T) - h(C^T) = h(B^T | C^T) + h(B^T | A^T) - h(C^T | A^T) - h(B^T | C^T, A^T) \\
& \overset{(c)}{\leq} \sum_{k=1}^{F} \sum_{\tau=1}^{T} \Big( h\big(B^{(k)}(\tau) | C^{(k)}(\tau)\big) + h\big(B^{(k)}(\tau) | A^{(k)}(\tau)\big) - h\big(C^{(k)}(\tau) | A^{(k)}(\tau)\big) - h\big(B^{(k)}(\tau) | C^{(k)}(\tau), A^{(k)}(\tau)\big) \Big) \\
& = \sum_{k=1}^{F} \sum_{\tau=1}^{T} \Big( h\big(B^{(k)}(\tau)\big) - h\big(C^{(k)}(\tau)\big) \Big),
\end{aligned}$$

where in (c), we have used the chain rule and the fact that conditioning cannot increase differential entropy in the first two terms, and the definition of the channels $p_{B|A}, p_{C|B}$ in the final two terms above, similar to Lemma 1. The arguments for the derivation of the final step from (c), and the optimality of Gaussian inputs if the channels are Gaussian, are identical to the proof of Lemma 1 and hence omitted here. ■

Appendix VI 
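Proof of Theorem 2

The bounds (7) and (8) are trivial. We only need to show (9). Consider any achievable scheme. Let $T$ be the length of the code. Consider any $S \subseteq \mathcal{K}_1$. Then, for $i \in S$, from Fano's inequality we can write, for any $\epsilon > 0$,

$$T(R_i - \epsilon) \leq H(Y_i^T), \qquad (31)$$
$$\Rightarrow T\Big( \sum_{i \in S} R_i - |S| \epsilon \Big) \leq \sum_{i \in S} H(Y_i^T) = H(Y_S^T) = H(Y_S^T | U_{S^c}^T), \qquad (32)$$

where, in the final two equations, we have used the fact that $(Y_i, U_i)$ is independent of $(Y_j, U_j)$ for $j \neq i$, $i, j \in \mathcal{K}_1$. Now, using Fano's inequality for $W_1$, we get

$$\begin{aligned}
T(R_1 - \epsilon) &\leq I(Y_1^T; X_1^T) \qquad (33) \\
&\leq I(Y_1^T, U_{S^c}^T; X_1^T) \qquad (34) \\
&= I(Y_1^T; X_1^T | U_{S^c}^T) \qquad (35) \\
&= H(Y_1^T | U_{S^c}^T) - H(Y_1^T | X_1^T, U_{S^c}^T) \qquad (36) \\
&\leq T H(Y_1 | U_{S^c}) - H(V^T | U_{S^c}^T), \qquad (37)
\end{aligned}$$

where we have used (2) and the concavity of the conditional entropy function above. Summing (32) and (37), we get

$$\begin{aligned}
T\Big( R_1 + \sum_{i \in S} R_i - K \epsilon \Big) &\leq T H(Y_1 | U_{S^c}) + H(Y_S^T | U_{S^c}^T) - H(V^T | U_{S^c}^T) \qquad (38) \\
&\leq T H(Y_1 | U_{S^c}) + H(Y_S^T | V^T, U_{S^c}^T). \qquad (39)
\end{aligned}$$

In (39), we have used the fact that for any arbitrary variables $A, B, C$, $H(A | C) - H(B | C) \leq H(A | B, C)$. Dividing the final equation by $T$ and taking $T \to \infty$, we get the desired bound.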
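
The entropy inequality invoked for (39) holds for arbitrary discrete variables, since $H(A | B, C) + H(B | C) = H(A, B | C) \geq H(A | C)$. The following Python sketch spot-checks it on randomly drawn joint distributions. The helper names, alphabet sizes and the random generation procedure are our own illustrative assumptions.

```python
import itertools
import math
import random
from collections import defaultdict


def cond_entropy(joint, target, given):
    """H(target | given) in bits; joint maps outcome tuples to probabilities,
    and target / given are tuples of coordinate indices."""
    p_tg = defaultdict(float)
    p_g = defaultdict(float)
    for outcome, p in joint.items():
        g = tuple(outcome[i] for i in given)
        t = tuple(outcome[i] for i in target)
        p_tg[(t, g)] += p
        p_g[g] += p
    return -sum(p * math.log2(p / p_g[g]) for (t, g), p in p_tg.items() if p > 0)


def inequality_slack(joint):
    """H(A|B,C) - (H(A|C) - H(B|C)) with coordinates (A, B, C) = (0, 1, 2).
    The inequality holds exactly when this slack is non-negative."""
    return (cond_entropy(joint, (0,), (1, 2))
            - cond_entropy(joint, (0,), (2,))
            + cond_entropy(joint, (1,), (2,)))


def random_joint(rng, sizes=(2, 3, 2)):
    """Draw a random joint pmf on a product alphabet of the given sizes."""
    outcomes = itertools.product(*(range(s) for s in sizes))
    weights = {o: rng.random() for o in outcomes}
    total = sum(weights.values())
    return {o: w / total for o, w in weights.items()}
```

A quick check is to draw a few hundred distributions with a seeded `random.Random` instance and confirm that the slack is never negative beyond floating-point error.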

Appendix VII 
Proof of Theorem[3] 

We provide a random coding achievable scheme along the lines of reference [26], which studied the deterministic 
Z channel. Without loss of generality, let us assume that W, = {1,2,..., 2 TRi }. Consider any product distribution 

UieKPXiixi). 

Encoding: The first transmitter generates $2^{TR_1}$ independent codewords $X_1^T$, generating each element i.i.d. according
to $p_{X_1}(x_1)$. Let the generated sequences be denoted by $X_1^T(m)$, $m \in \{1, 2, \ldots, 2^{TR_1}\}$. Then the message
$W_1 = m$ is encoded using $X_1^T(m)$. Now consider Transmitter $i \in \mathcal{K}_1$. Note that $p_{X_i}(x_i)$ along with the channel
induces $p_{X_i, U_i, Y_i}(x_i, u_i, y_i)$, from which the marginal distributions $p_{U_i}(u_i)$, $p_{Y_i}(y_i)$ and $p_{U_i, Y_i}(u_i, y_i)$ can be calculated.
The transmitter generates $2^{T\Omega_i}$ sequences of $U_i^T$, each sequence generated independently and i.i.d. according to
$p_{U_i}(u_i)$, where $\Omega_i > 0$. We denote the $m$th sequence generated as $U_i^T(m)$, where $m \in \{1, 2, \ldots, 2^{T\Omega_i}\}$. The
transmitter also generates $2^{TH(Y_i)}$ sequences of $Y_i^T$, each sequence generated independently and i.i.d. according to
$p_{Y_i}(y_i)$. These sequences of $Y_i^T$ are distributed uniformly into $2^{TR_i}$ bins. To encode the $m$th message, the transmitter
picks a $Y_i^T$ sequence in the $m$th bin such that it is jointly typical with $U_i^T(m_i)$ for some $m_i \in \{1, 2, \ldots, 2^{T\Omega_i}\}$.
If no such sequence is found, then an error is declared. Otherwise, the message is encoded using the $X_i^T$ which
generates the $(U_i^T, Y_i^T)$ sequence picked. The existence of such an $X_i^T$ sequence is guaranteed, because the channel
is deterministic and the pair $(U_i^T, Y_i^T)$, by virtue of being jointly typical, has a non-zero probability of occurrence.

Remark: The encoding strategy at Transmitter 2 is similar to the optimal coding strategy over the deterministic 
broadcast channel [31]. 
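
The role of the auxiliary rate $\Omega_i$ in this encoder can be seen from a heuristic count of jointly typical pairs per bin. The sketch below is an illustration only, not the argument of the cited reference: it assumes that the transmitter draws about $2^{TH(Y_i)}$ sequences $Y_i^T$, that the $U_i^T$ and $Y_i^T$ codebooks are drawn independently, and that two independently drawn typical sequences are jointly typical with probability about $2^{-T I(U_i; Y_i)}$.

```latex
\begin{align*}
\#\{Y_i^T \text{ sequences per bin}\} &\approx 2^{T(H(Y_i) - R_i)}, \\
\#\{U_i^T \text{ sequences}\} &= 2^{T\Omega_i}, \\
\Pr\big((U_i^T, Y_i^T) \text{ jointly typical}\big) &\approx 2^{-T I(U_i; Y_i)}, \\
\mathbb{E}\big[\#\{\text{jointly typical pairs in a bin}\}\big]
  &\approx 2^{T(\Omega_i + H(Y_i) - R_i - I(U_i; Y_i))} \\
  &= 2^{T(\Omega_i + H(Y_i \mid U_i) - R_i)}.
\end{align*}
% Heuristic reading: the expected count grows with T when
%   R_i < \Omega_i + H(Y_i | U_i),
% and each bin is non-empty on average when R_i < H(Y_i).
```

Under these assumptions, the two requirements in the comment have the same form as the conditions on $R_i$ stated later in the error analysis.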

Decoding Strategy: Receiver 1, on receiving $Y_1^T$, chooses the unique index $\hat{W}_1 = m$ such that

$$\big(Y_1^T, U_2^T(m_2), U_3^T(m_3), \ldots, U_K^T(m_K), X_1^T(m)\big)$$

is jointly typical, for some $m_i \in \{1, 2, \ldots, 2^{T\Omega_i}\}$ for $i = 2, 3, \ldots, K$. An error is declared if no such unique index
$m$ is found. Receiver $i \in \mathcal{K}_1$ can decode $W_i$ using the bin-index of the received $Y_i^T$ sequence. Note that since the
channel is deterministic, there are no errors at Receiver $i \neq 1$, if the encoding at Transmitter $i$ is successful.

Error Analysis and Achieved rate: Since the coding scheme is symmetric over all messages, we will analyze the
probability of error assuming $W_i = 1$ is encoded at Transmitter $i \in \mathcal{K}$; because of symmetry in the coding scheme,
the probability of error of encoding this set of messages gives the probability of error averaged over all messages.
Now, we divide the possible set of errors into two types: errors at Transmitter $i \in \mathcal{K}_1$ and errors at Receiver 1.

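For $i \in \mathcal{K}_1$, let

$$E_i(m_i) = \Big\{ \text{at Transmitter } i,\ \nexists\, (U_i^T, Y_i^T) \text{ such that } Y_i^T \text{ belongs to the } m_i\text{th bin},\ U_i^T = U_i^T(m) \text{ for some } m \in \{1, 2, \ldots, 2^{T\Omega_i}\},\ \text{and } (U_i^T, Y_i^T) \in A_\epsilon(U_i, Y_i) \Big\},$$

where $A_\epsilon(U_i, Y_i)$ represents the $\epsilon$-jointly typical set of $(U_i^T, Y_i^T)$ pairs. Note that $E_i(m_i)$ corresponds to the event
that no jointly typical pair $(Y_i^T, U_i^T)$ was found in the $m_i$th bin at Transmitter $i$ when encoding $W_i = m_i$. The
overall probability of error can now be bounded as

$$P_e \le \Pr\Big(\bigcup_{i \in \mathcal{K}_1} E_i(1)\Big) + \Pr\big(\text{Decoding error at Receiver 1} \mid E_2^c(1), E_3^c(1), \ldots, E_K^c(1)\big).$$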



Now, consider Receiver 1. Note that, at this receiver, the decoding procedure and hence the error events are very
similar in nature to the errors that can occur over a multiple access channel (MAC) when the asymptotically
optimal typical set coding procedure is used. The only difference is that, in this particular case, the receiver is only
interested in one message, i.e., $W_1$, which reduces the number of possible error events as compared to the classical
MAC. Now, given that the message $W_i = 1$ is encoded at Transmitter $i$ for $i \in \mathcal{K}$, and no errors occurred at the
transmitter, a sequence $U_i^T$ is found at the transmitter such that it is jointly typical with a sequence in the first
bin. Let us assume, without loss of generality, that this sequence found is $U_i^T(1)$, i.e., $U_i^T(1)$ is used in the encoding
procedure at Transmitter $i$, for $i \neq 1$. If $E_i^c(1)$ occurs for $i \neq 1$, then, because of the deterministic nature of the
channel, the received sequence $Y_1^T$ is jointly typical with $U_2^T(1), U_3^T(1), \ldots, U_K^T(1), X_1^T(1)$. Errors can occur if
$Y_1^T$ is jointly typical with $U_2^T(m_2), U_3^T(m_3), \ldots, U_K^T(m_K), X_1^T(m_1)$ for some $m_1 \neq 1$, $m_i \in \{1, 2, \ldots, 2^{T\Omega_i}\}$
for $i \in \mathcal{K}_1$. We wish to evaluate the probability of occurrence of this event. Let us define

$$E_1(m_1, m_2, m_3, \ldots, m_K) = \big\{ \big(Y_1^T, U_2^T(m_2), U_3^T(m_3), \ldots, U_K^T(m_K), X_1^T(m_1)\big) \text{ is jointly typical} \big\}.$$

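By the union bound, we can bound the error at Receiver 1 as

\begin{align*}
&\Pr\big(\text{Error at Receiver 1} \mid E_2^c(1), \ldots, E_K^c(1)\big) \\
&\quad \le \sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{m_i \in \{1, 2, \ldots, 2^{T\Omega_i}\},\, i \in \mathcal{K}_1} \Pr\big(E_1(m_1, m_2, \ldots, m_K)\big) \\
&\quad = \sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{S \subseteq \mathcal{K}_1} \ \sum_{\substack{m_i \neq 1,\, i \in S, \\ m_i = 1,\, i \in S^c}} \Pr\big(E_1(m_1, m_2, \ldots, m_K)\big) \\
&\quad = \sum_{S \subseteq \mathcal{K}_1} \ \sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{\substack{m_i \neq 1,\, i \in S, \\ m_i = 1,\, i \in S^c}} \Pr\big(E_1(m_1, m_2, \ldots, m_K)\big).
\end{align*}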

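Now, the overall error probability can be bounded as

$$P_e \le \sum_{S \subseteq \mathcal{K}_1} \ \sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{\substack{m_i \neq 1,\, i \in S, \\ m_i = 1,\, i \in S^c}} \Pr\big(E_1(m_1, m_2, \ldots, m_K)\big) + \sum_{i \in \mathcal{K}_1} \Pr\big(E_i(1)\big). \tag{40}$$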

It has been shown in [32] that if, for $i \neq 1$,

\begin{align}
R_i &< H(Y_i) \tag{41} \\
R_i &< \Omega_i + H(Y_i) - I(Y_i; U_i) \nonumber \\
&= \Omega_i + H(Y_i \mid U_i), \tag{42}
\end{align}

then, asymptotically as $T \to \infty$, the probability of $E_i(1)$ occurring vanishes. Now, we estimate the remaining term
in (40) below. Let $A_\epsilon$ denote the set of all $\epsilon$-jointly typical sequences of $(Y_1^T, U_2^T, \ldots, U_K^T, X_1^T)$. Then, for any
$S \subseteq \mathcal{K}_1$, we can write

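\begin{align*}
&\sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{\substack{m_i \neq 1,\, i \in S, \\ m_i = 1,\, i \in S^c}} \Pr\big(E_1(m_1, m_2, \ldots, m_K)\big) \\
&\quad = \sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{\substack{m_i \neq 1,\, i \in S, \\ m_i = 1,\, i \in S^c}} \Pr\big( (Y_1^T, U_2^T(m_2), U_3^T(m_3), \ldots, U_K^T(m_K), X_1^T(m_1)) \in A_\epsilon \big) \\
&\quad \le \sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{\substack{m_i \neq 1,\, i \in S, \\ m_i = 1,\, i \in S^c}} \ \sum_{(y_1^T, u_2^T, \ldots, u_K^T, x_1^T) \in A_\epsilon} \Pr\big(Y_1^T = y_1^T \mid U_{S^c}^T = u_{S^c}^T\big) \Pr\big(X_1^T(m_1) = x_1^T\big) \prod_{i \in \mathcal{K}_1} \Pr\big(U_i^T(m_i) = u_i^T\big) \\
&\quad \overset{(a)}{\le} \sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{\substack{m_i \neq 1,\, i \in S, \\ m_i = 1,\, i \in S^c}} |A_\epsilon| \, 2^{-TH(Y_1 \mid U_{S^c}) + \epsilon} \, 2^{-T(H(U_{\mathcal{K}_1}) + H(X_1)) + K\epsilon} \\
&\quad \le \sum_{m_1 = 2}^{2^{TR_1}} \ \sum_{\substack{m_i \neq 1,\, i \in S, \\ m_i = 1,\, i \in S^c}} 2^{-TH(Y_1 \mid U_{S^c}) + (K+2)\epsilon} \\
&\quad \le 2^{TR_1} \, 2^{T \sum_{i \in S} \Omega_i} \, 2^{-TH(Y_1 \mid U_{S^c}) + (K+2)\epsilon},
\end{align*}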

where, in (a), we have used the fact that $U_i$ is independent of $U_j$ for $i \neq j$. From the above equation, it can be
concluded that, if

$$R_1 + \sum_{i \in S} \Omega_i < H(Y_1 \mid U_{S^c}), \quad \forall S \subseteq \mathcal{K}_1, \tag{43}$$

then, asymptotically as $T \to \infty$, the probability of error at Receiver 1 vanishes. Note that if $S$ is the null set, the
above equation can be equivalently expressed as $R_1 < H(Y_1 \mid U_{\mathcal{K}_1})$. Thus, if the rates $R_i, i \in \mathcal{K}$ and the parameters
$\Omega_i, i \in \mathcal{K}_1$ satisfy (41), (42), and (43), the rate-tuple $(R_1, R_2, \ldots, R_K)$ is achievable. Eliminating $\Omega_i, i \in \mathcal{K}_1$ from
these inequalities using Fourier-Motzkin elimination, we get the achieved rate region in the desired form.
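
To illustrate the elimination step, consider the special case $K = 2$, so that $\mathcal{K}_1 = \{2\}$ and there is a single auxiliary rate $\Omega_2$. The sketch below is a worked instance under our reading of (41), (42) and (43) as strict inequalities, together with the constraint $\Omega_2 > 0$ from the codebook construction; it illustrates the procedure and is not a statement of the general region.

```latex
% Constraints for K = 2 (assumed reading of (41)-(43)):
\begin{align*}
R_2 &< H(Y_2), \\
R_2 &< \Omega_2 + H(Y_2 \mid U_2), \\
R_1 &< H(Y_1 \mid U_2) && (S = \emptyset), \\
R_1 + \Omega_2 &< H(Y_1) && (S = \{2\}), \\
\Omega_2 &> 0.
\end{align*}
% Lower bounds on \Omega_2: 0 and R_2 - H(Y_2 | U_2).
% Upper bound on \Omega_2:  H(Y_1) - R_1.
% A feasible \Omega_2 exists iff each lower bound is strictly below the upper bound:
\begin{align*}
0 < H(Y_1) - R_1 &\iff R_1 < H(Y_1), \\
R_2 - H(Y_2 \mid U_2) < H(Y_1) - R_1 &\iff R_1 + R_2 < H(Y_1) + H(Y_2 \mid U_2).
\end{align*}
% R_1 < H(Y_1) is implied by R_1 < H(Y_1 | U_2), so the projection is
\begin{align*}
R_1 &< H(Y_1 \mid U_2), \\
R_2 &< H(Y_2), \\
R_1 + R_2 &< H(Y_1) + H(Y_2 \mid U_2).
\end{align*}
```

For general $K$, the same pairing of lower and upper bounds is repeated for each $\Omega_i$, $i \in \mathcal{K}_1$, one variable at a time.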